Learning Multiple Behaviors from Unlabeled Demonstrations in a Latent Controller Space
Abstract
In this paper we introduce a method to learn multiple behaviors, in the form of motor primitives, from an unlabeled dataset. One difficulty of this problem is that behaviors can be heavily mixed in the measurement space, even though a latent representation exists in which they can be easily separated. We propose a mixture model based on a Dirichlet Process (DP) that simultaneously clusters the observed time series and recovers a sparse representation of the behaviors, using a Laplacian prior as the base measure of the DP. We show that for linear models, e.g., potential functions generated by linear combinations of a large number of features, it is possible to compute the marginal of the observations analytically and derive an efficient sampler. The method is evaluated on robot behaviors and on real human motion data, and compared to other techniques.
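To make the modeling assumptions in the abstract concrete, here is a minimal generative sketch, not the paper's sampler. It assumes that cluster labels follow a Chinese restaurant process (the partition prior induced by a DP), that each behavior has a weight vector drawn from a Laplace base measure, and that each demonstration is a noisy linear combination of features. The function names, the random feature matrix, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_crp(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels = np.zeros(n, dtype=int)
    counts = [1]  # the first item opens the first cluster
    for i in range(1, n):
        # Existing clusters are picked in proportion to their size,
        # a new cluster in proportion to the concentration alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return labels


def generate_demonstrations(n, n_features, n_steps,
                            alpha=1.0, scale=1.0, noise=0.1, seed=0):
    """Generate n unlabeled time series from a DP-style mixture of linear models."""
    rng = np.random.default_rng(seed)
    labels = sample_crp(n, alpha, rng)
    n_clusters = labels.max() + 1
    # One weight vector per behavior, drawn from a Laplace base measure
    # (assumption: this stands in for the sparsity-promoting prior).
    weights = rng.laplace(0.0, scale, size=(n_clusters, n_features))
    # Hypothetical feature matrix shared by all demonstrations.
    features = rng.normal(size=(n_steps, n_features))
    clean = features @ weights[labels].T  # shape (n_steps, n)
    observations = clean.T + noise * rng.normal(size=(n, n_steps))
    return labels, weights, observations


labels, weights, observations = generate_demonstrations(20, 50, 30)
print(observations.shape, "demonstrations from", labels.max() + 1, "behaviors")
```

Inference would go the other way: given only `observations`, recover `labels` and `weights`. The sketch covers only the forward model that such inference would invert.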
Similar resources
InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
The goal of imitation learning is to mimic expert behavior without access to an explicit reward signal. Expert demonstrations provided by humans, however, often show significant variability due to latent factors that are typically not explicitly modeled. In this paper, we propose a new algorithm that can infer the latent structure of expert demonstrations in an unsupervised way. Our method, bui...
Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs
The goal of imitation learning is to mimic expert behavior without access to an explicit reward signal. Expert demonstrations provided by humans, however, often show significant variability due to latent factors that are typically not explicitly modeled. In this paper, we propose a new algorithm that can infer the latent structure of expert demonstrations in an unsupervised way. Our method, bui...
Time-Contrastive Networks: Self-Supervised Learning from Video
We propose a self-supervised approach for learning representations and robotic behaviors entirely from unlabeled videos recorded from multiple viewpoints, and study how this representation can be used in two robotic imitation settings: imitating object interactions from videos of humans, and imitating human poses. Imitation of human behavior requires a viewpoint-invariant representation that ca...
Imitation and Reinforcement Learning from Failed Demonstrations
Current work in robotic imitation learning uses successful demonstrations of a task performed by a human teacher to initialize a robot controller. Given a reward function, this learned controller can then be improved using techniques derived from reinforcement learning. We instead use failed attempts, which may be more plentiful, to initialize our controller and, taking them as illustrations of...
Inverse Reinforce Learning with Nonparametric Behavior Clustering
Inverse Reinforcement Learning (IRL) is the task of learning a single reward function given a Markov Decision Process (MDP) without defining the reward function, and a set of demonstrations generated by humans/experts. However, in practice, it may be unreasonable to assume that human behaviors can be explained by one reward function since they may be inherently inconsistent. Also, demonstration...